1 Introduction

The 2030 Agenda for Sustainable Development lays out an ambitious plan for achieving gender equality and empowering all women and girls. The Agenda, adopted by nearly all countries, is paired with a set of goals and targets, each with concrete indicators for tracking progress.

However, countries are falling short in reporting on these indicators. This article diagnoses four reasons for missing gender data: (1) a lack of core or foundational surveys, censuses, or relevant administrative data sources; (2) a lack of specialized surveys; (3) flawed surveys or MIS, for example poor questionnaire design or badly designed forms in administrative data; and (4) poor dissemination, where data or statistics from surveys, censuses, or administrative sources exist but are not easy to obtain.

To begin, the scope of missing data is discussed. Data is largely sourced from the UN Global SDG database. A set of roughly 91 indicators is investigated; each is either part of SDG 5 on gender equality or an SDG indicator that calls for a specific breakdown by sex.

Next, for each country, the share of these indicators that is available over a five-year period from 2015 to 2020 is assessed. Globally, countries have a value for fewer than 40% of these indicators on average during this period. For tier 1 indicators, the average is slightly higher, with countries having a recent value for around 44% of the indicators. For tier 2 indicators, the average country has a recent value for only 27% of the indicators. For Goal 5 specifically, the average country has a recent value for around 34% of the indicators.

Additionally, for indicators that require a breakdown by sex, in several cases there are large gaps between the percentage of countries with a recent value for females and the percentage with a value for any sex or both sexes combined. For instance, although SDG indicator 1.2.1, on the proportion of population living below the national poverty line, specifies a breakdown by sex, this information is not currently available in the UN global SDG database. For SDG indicator 1.3.1, on the proportion of population covered by social protection floors/systems, by sex, 79.4% of countries have a recent value for any/both sexes, but only 7.8% of countries have a recent value for females. To cite another example, for SDG indicator 4.1.1, on the proportion of children and young people (a) in grades 2/3; (b) at the end of primary; and (c) at the end of lower secondary achieving at least a minimum proficiency level in (i) reading and (ii) mathematics, by sex, 62.4% of countries have a value for any/both sexes, but only 52.3% of countries have a value for females.

This paper then discusses why such gaps in gender data exist. First, the relationship between a country’s Statistical Performance Indicators (SPI) overall score and the country’s performance at producing gender statistics is examined. Next, the specific set of data sources used to produce gender SDG indicators is examined.

1.2 Coverage Gaps for Gender Indicators

This section discusses the current availability of gender data. Particular attention is given to the availability of data for reporting on the Sustainable Development Goals (SDGs), because the SDGs represent an agreed-upon set of goals for countries to meet and because the vast majority of countries, through the SDG dialogue, have agreed to report on indicators related to these goals. Coverage gaps across regions, income groups, ages, and years are discussed.

While nearly all countries have agreed to report on the SDG indicators, the UN global SDG indicators database shows that major gaps exist in which indicators are available. The database provides access to the data compiled by the UN for the annual Sustainable Development Goals Report, which tracks progress toward fulfilling the SDGs (United Nations Statistics Division 2021).

The methodology for this exercise follows Dang and Serajuddin (2020), who summarise the availability of SDG indicators in the UN SDG database. For each SDG indicator, a country scores “1” if a value can be found in the database for that indicator between 2010 and 2020, and “0” if no value can be found for that indicator between 2010 and 2020. The value for the region and income group in Figure 1 and Figure 2 below is the percentage of countries in that region or income group with a value found in the database. The percentages are raw percentages across regions or income groups and are not weighted for population. Only tier 1 indicators are included in this analysis.
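The scoring rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the record layout, the placeholder country codes and group names, and the function name `availability_by_group` are all hypothetical, and the values are toy data.

```python
from collections import defaultdict

# Toy records of (country, indicator, year); a record means a value was found.
# Country codes, groups, and years are illustrative only, not real data.
RECORDS = [
    ("AAA", "5.5.1", 2012),
    ("BBB", "5.5.1", 2008),  # outside the 2010-2020 window, so it earns no credit
    ("CCC", "5.5.1", 2019),
    ("DDD", "5.5.1", 2020),
]
GROUPS = {"AAA": "Group 1", "BBB": "Group 1", "CCC": "Group 2", "DDD": "Group 2"}


def availability_by_group(records, groups, indicator, start=2010, end=2020):
    """Return the unweighted percentage of countries per group with a value."""
    # A country scores 1 if any value for the indicator falls inside the window.
    covered = {
        country
        for country, ind, year in records
        if ind == indicator and start <= year <= end
    }
    totals = defaultdict(int)
    hits = defaultdict(int)
    for country, group in groups.items():
        totals[group] += 1
        hits[group] += 1 if country in covered else 0
    # Raw share of countries, not weighted by population.
    return {group: 100.0 * hits[group] / totals[group] for group in totals}


shares = availability_by_group(RECORDS, GROUPS, "5.5.1")
print(shares)
```

Because the shares are simple counts of countries, a small country and a large country contribute equally to the group percentage.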

1.2.1 Methodology

  1. Download the latest SDG indicator data from UN Stats (https://unstats.un.org/sdgs/indicators/en/#) using their API

  2. Transform the data so that, for each indicator, we can create a score documenting whether a value exists for the country in a given year and whether the value is based on country data, country data adjusted, estimated data, or modelled data according to the UN Stats metadata. This will only include tier 1 indicators.

  3. Combine the resulting data into a single set of indicators by calculating the average across the gender SDGs.

Below is a paraphrased description from the UN stats webpage (https://unstats.un.org/sdgs/indicators/indicators-list/):

The global indicator framework for Sustainable Development Goals was developed by the Inter-Agency and Expert Group on SDG Indicators (IAEG-SDGs) and agreed upon at the 48th session of the United Nations Statistical Commission held in March 2017.

The global indicator framework includes 231 unique indicators. Please note that the total number of indicators listed in the global indicator framework of SDG indicators is 247. However, twelve indicators repeat under two or three different targets.

For each value of the indicator, the responsible international agency has been requested to indicate whether the national data were adjusted, estimated, modelled or are the result of global monitoring. The “nature” of the data in the SDG database is determined as follows:

  • Country data (C): Produced and disseminated by the country (including data adjusted by the country to meet international standards);

  • Country data adjusted (CA): Produced and provided by the country, but adjusted by the international agency for international comparability to comply with internationally agreed standards, definitions and classifications;

  • Estimated (E): Estimated based on national data, such as surveys or administrative records, or other sources but on the same variable being estimated, produced by the international agency when country data for some year(s) is not available, when multiple sources exist, or when there are data quality issues;

  • Modelled (M): Modelled by the agency on the basis of other covariates when there is a complete lack of data on the variable being estimated;

  • Global monitoring data (G): Produced on a regular basis by the designated agency for global monitoring, based on country data. There is no corresponding figure at the country level.

For each indicator, we will produce a value for each country with the following coding scheme:

  • 1 Point: Indicator exists and the value is based on country data, country data adjusted, estimated data, or global monitoring data
  • 0 Points: Indicator is based on modelled data or does not exist

We give countries no credit for modelled data, because in that case the country did not produce data in a form that was directly usable for reporting on the SDG indicator.
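The coding scheme amounts to a lookup on the nature code. The sketch below is illustrative only: the single-letter codes are the ones listed in the UN description above, while the function name `score` and the use of `None` for a missing value are assumptions made for the example.

```python
# Nature codes that earn credit: country data (C), country data adjusted (CA),
# estimated (E), and global monitoring data (G). Modelled data (M) does not.
CREDITED = {"C", "CA", "E", "G"}


def score(nature):
    """Return 1 if a value exists and is not modelled, otherwise 0.

    `nature` is the nature code attached to the value, or None when no value
    exists for the country and indicator (a convention assumed here).
    """
    return 1 if nature in CREDITED else 0


for code in ["C", "CA", "E", "G", "M", None]:
    print(code, score(code))
```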

When we average over all indicators in a goal to get a score, we compute a five-year moving average to smooth out year-to-year variability in SDG reporting. The overall score for an SDG is then the five-year average of the percentage of indicators for that SDG with available values based on country data, country data adjusted, estimated data, or global monitoring data.
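As a worked illustration, the averaging step might look like the following. The text does not say how the window is aligned or how years without records are handled, so this sketch assumes a trailing window that ends in the reference year and treats a year with no records as zero availability; the function name `goal_score` and the toy scores are hypothetical.

```python
def goal_score(yearly_scores, year, window=5):
    """Average the yearly share of credited indicators over a trailing window.

    `yearly_scores` maps a year to the list of 0/1 indicator scores for one
    country and one goal. The result is a percentage between 0 and 100.
    """
    shares = []
    for y in range(year - window + 1, year + 1):
        scores = yearly_scores.get(y, [])
        # Assumption: a year with no records counts as zero availability.
        shares.append(sum(scores) / len(scores) if scores else 0.0)
    return 100.0 * sum(shares) / window


# Toy scores for four indicators in one goal; not real data.
toy = {
    2016: [1, 0, 0, 0],
    2017: [1, 1, 0, 0],
    2018: [1, 1, 0, 0],
    2019: [1, 1, 1, 0],
    2020: [1, 1, 1, 1],
}
result = goal_score(toy, 2020)
print(result)
```

With these toy values the yearly shares are 25%, 50%, 50%, 75%, and 100%, so the five-year average is 60%.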

Figure 1. Percentage of SDG Gender Indicators available between 2015 and 2020. Each point corresponds to a country

Figures 2 and 3 below break down the availability of gender-related indicators into Tier 1 and Tier 2 indicators.

Figure 2. Percentage of SDG Gender Tier 1 Indicators available between 2015 and 2020. Each point corresponds to a country

As shown in Figure 3, countries significantly lag in the production of Tier 2 gender indicators.

Figure 3. Percentage of SDG Gender Tier 2 Indicators available between 2015 and 2020. Each point corresponds to a country

Next, the availability of gender indicators in SDG 5 on gender equality is shown. This includes both the tier 1 and tier 2 indicators.

Figure 4. Percentage of SDG 5 (Gender Equality) Indicators available between 2015 and 2020. Each point corresponds to a country

1.3 Coverage Gaps for SDG Indicators requiring Gender Breakdown

For indicators that require a breakdown by sex, in several cases there are large gaps between the percentage of countries with a recent value for females and the percentage with a value for any sex or both sexes combined. For instance, although SDG indicator 1.2.1, on the proportion of population living below the national poverty line, specifies a breakdown by sex, this information is not currently available in the UN global SDG database. For SDG indicator 1.3.1, on the proportion of population covered by social protection floors/systems, by sex, 79.4% of countries have a recent value for any/both sexes, but only 7.8% of countries have a recent value for females. To cite another example, for SDG indicator 4.1.1, on the proportion of children and young people (a) in grades 2/3; (b) at the end of primary; and (c) at the end of lower secondary achieving at least a minimum proficiency level in (i) reading and (ii) mathematics, by sex, 62.4% of countries have a value for any/both sexes, but only 52.3% of countries have a value for females.
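The comparison between the two shares can be sketched as follows. This is an illustration under assumptions, not the procedure used to produce the figures above: the sex labels `BOTHSEX`, `FEMALE`, and `MALE`, the record layout, the placeholder country codes, and the function name `share_with_value` are all chosen for the example, and the data are toy values.

```python
def share_with_value(records, countries, indicator, sexes):
    """Percentage of countries with at least one value whose sex label is in `sexes`."""
    covered = {
        country
        for country, ind, sex in records
        if ind == indicator and sex in sexes
    }
    return 100.0 * len(covered & set(countries)) / len(countries)


# Toy data: four countries, three with a combined value, one with a female value.
COUNTRIES = ["AAA", "BBB", "CCC", "DDD"]
SEX_RECORDS = [
    ("AAA", "1.3.1", "BOTHSEX"),
    ("AAA", "1.3.1", "FEMALE"),
    ("BBB", "1.3.1", "BOTHSEX"),
    ("CCC", "1.3.1", "BOTHSEX"),
]

any_sex = share_with_value(SEX_RECORDS, COUNTRIES, "1.3.1", {"BOTHSEX", "FEMALE", "MALE"})
female = share_with_value(SEX_RECORDS, COUNTRIES, "1.3.1", {"FEMALE"})
gap = any_sex - female  # percentage-point gap in coverage
print(any_sex, female, gap)
```

A large positive gap indicates that an indicator is being reported, but without the sex disaggregation the indicator calls for.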

2 Comparison of Gender Data Availability and SPI Scores

Next, we assess how a country’s overall statistical performance relates to the availability of gender data, and identify countries that may have strong systems overall but are underperforming on gender statistics. To do so, we compare the availability of gender statistics to scores on the World Bank’s Statistical Performance Indicators (SPI).

The World Bank’s Statistical Performance Indicators (SPI) measure statistical performance across 174 countries. The indicators are grouped into five pillars: (1) data use, which captures the demand side of the statistical system; (2) data services, which looks at the interaction between data supply and demand such as the openness of data and quality of data releases; (3) data products, which reviews whether countries report on important indicators; (4) data sources, which assesses whether censuses, surveys, and other data sources are created; and (5) data infrastructure, which captures whether foundations such as financing, skills, and governance needed for a strong statistical system are in place. Within each pillar is a set of dimensions, and under each dimension is a set of indicators to measure performance. The indicators provide a time series extending at least from 2016 to 2019 in all cases, with some indicators going back to 2004. The data for the indicators are from a variety of sources, including databases produced by the World Bank, International Monetary Fund (IMF), United Nations (UN), Partnership in Statistics for Development in the 21st Century (PARIS21), and Open Data Watch—and in some cases, directly from national statistical office websites. The indicators are also summarized as an index, termed the SPI overall score, with scores ranging from a low of 0 to a high of 100.

There is a positive relationship between countries’ SPI overall scores and the availability of gender data, as indicated in Figure 3.

Figure 3. Plot of SPI overall score on Availability of Gender SDG Indicators

To highlight countries where these relationships do not hold as well, the next figure shows the 15 countries that most over-perform and the 15 countries that most under-perform on the availability of gender data compared to their SPI overall score.

To produce this figure, we use OLS regression to estimate a linear model of the availability of gender SDG data on the SPI overall score in 2019. The residual can be interpreted as the difference between a country’s availability of gender data and the expected availability given its SPI overall score. Countries with residuals greater than zero are over-performing relative to their SPI overall score, and countries with residuals less than zero are under-performing.
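The residual calculation can be illustrated with a small, self-contained sketch. It fits a one-variable OLS line by the closed-form formulas and ranks countries by residual; the function name `ols_residuals`, the placeholder country codes, and the scores are hypothetical toy values rather than SPI or SDG data.

```python
def ols_residuals(x, y):
    """Fit y = a + b * x by ordinary least squares; return (a, b, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual = observed availability minus availability predicted from SPI.
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, residuals


# Toy inputs: SPI overall score and gender data availability (%), per country.
names = ["AAA", "BBB", "CCC", "DDD"]
spi = [40.0, 60.0, 80.0, 60.0]
availability = [20.0, 45.0, 40.0, 15.0]

intercept, slope, resid = ols_residuals(spi, availability)
by_country = dict(zip(names, resid))

# Positive residual: over-performing; negative residual: under-performing.
ranked = sorted(names, key=lambda c: by_country[c], reverse=True)
print(ranked[0], ranked[-1])
```

In the toy data, BBB and DDD share the same SPI score, so the entire difference in their gender data availability shows up in the residuals; taking the first and last 15 entries of such a ranking would give the over- and under-performers.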

Figure 4: Top 15 Over/Under-Performers on SPI overall score compared to the Availability of Gender SDG Indicators in 2019

3 Data Sources for Gender Indicators

In what follows, the most common data sources for each of the SDG indicators in Table 1 are discussed. This can provide some guidance on the types of data sources that are typically used to report on these indicators, and reveal gaps that may exist for specific countries. Data sources are based on the metadata in the UN Global SDG Indicator Database.1

For each indicator, the ten most common data sources are tabulated. In some cases, the sources are unique to each country; in those cases, a set of five is chosen.

Dang, Hai-Anh H, and Umar Serajuddin. 2020. “Tracking the Sustainable Development Goals: Emerging Measurement Challenges and Further Reflections.” World Development 127: 104570.
United Nations Statistics Division. 2021. “The Sustainable Development Goals Report 2020.”

  1. Data was pulled from the UN Global SDG Indicators database on July 19, 2021.